Identifying Complex Unknown Audio
Authors
Abstract
As internet search evolves toward search based on multimedia content, audio content identification and retrieval will likely become one of the key components of next-generation internet search engines. In this paper we develop and evaluate a methodology for the identification (classification) of unknown, highly complex audio pieces from within the same genre (classical music). Our methodology combines a novel audio content descriptor, the wavelet dispersion vector, with neural net assessment of the similarity between unknown query vectors and known (example set) vectors. We define the wavelet dispersion vector as the histogram of the rank orders obtained by the wavelet coefficients of a given wavelet scale among all the coefficients (of all scales at a given time instant). We demonstrate that the wavelet dispersion vector precisely characterizes the audio content while achieving good generalization. We examine the identification performance of combinations of 39 different wavelets and three different types of neural nets. We find that our wavelet dispersion vector, calculated with a biorthogonal wavelet and used in conjunction with a probabilistic radial basis neural net trained on only three independent example sets, correctly identifies approximately 78% of the unknown audio query pieces.
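The descriptor and classifier described in the abstract can be illustrated with a minimal sketch. It is one reading of the definition, not the authors' implementation. The assumptions are: an undecimated Haar-style transform stands in for the biorthogonal wavelet, coefficients are ranked by magnitude, boundaries are handled circularly, and the probabilistic net is written as a plain Gaussian Parzen-window classifier. The function names and the `sigma` parameter are hypothetical.

```python
import numpy as np

def haar_detail_coefficients(signal, n_scales):
    """Undecimated Haar-style details: one coefficient per scale per sample.
    Stand-in for the paper's wavelets (assumption); boundaries wrap around."""
    approx = np.asarray(signal, dtype=float)
    details = np.empty((n_scales, approx.size))
    for j in range(n_scales):
        shifted = np.roll(approx, 2 ** j)
        details[j] = (approx - shifted) / 2.0
        approx = (approx + shifted) / 2.0
    return details

def wavelet_dispersion(coeffs):
    """Row s is the normalized histogram of the ranks that scale s obtains
    among all scales, taken over every time instant."""
    n_scales, n_times = coeffs.shape
    # Rank 0 = largest magnitude at that instant (ranking by |c| is an assumption).
    order = np.argsort(-np.abs(coeffs), axis=0, kind="stable")
    ranks = np.empty_like(order)
    cols = np.arange(n_times)
    for r in range(n_scales):
        ranks[order[r], cols] = r
    hist = np.zeros((n_scales, n_scales))
    for s in range(n_scales):
        hist[s] = np.bincount(ranks[s], minlength=n_scales)
    return hist / n_times

def pnn_classify(query, examples, labels, sigma=0.25):
    """Parzen-window (probabilistic neural net) decision: average a Gaussian
    kernel over each class's examples and return the best-scoring class."""
    examples = np.asarray(examples, dtype=float)
    d2 = np.sum((examples - np.asarray(query, dtype=float)) ** 2, axis=1)
    activations = np.exp(-d2 / (2.0 * sigma ** 2))
    labels_arr = np.asarray(labels)
    classes = sorted(set(labels))
    scores = [activations[labels_arr == c].mean() for c in classes]
    return classes[int(np.argmax(scores))]

def describe(signal, n_scales=4):
    """Flattened dispersion vector for one audio excerpt."""
    return wavelet_dispersion(haar_detail_coefficients(signal, n_scales)).ravel()

# Toy usage: two synthetic "pieces" (slow and fast tones), two examples each.
t = np.arange(512)
examples = [describe(np.sin(2 * np.pi * t / 64 + p)) for p in (0.0, 0.3)]
examples += [describe(np.sin(2 * np.pi * t / 4 + p)) for p in (0.0, 0.3)]
labels = ["low", "low", "high", "high"]
predicted = pnn_classify(describe(np.sin(2 * np.pi * t / 64 + 0.7)), examples, labels)
```

Because every scale receives exactly one rank at every time instant, each row and each column of the histogram matrix sums to one. The descriptor therefore records how the scales order themselves relative to one another rather than the absolute signal level.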
Similar resources
A New Feature Extraction Approach Based on Compressive Sampling in Audio Signal Processing
In this paper, we present a Compressive Sampling (CS)-based feature extraction method for audio signals. In the proposed approach, the audio signal is first segmented using Hamming windows, and the Discrete Fourier Transform (DFT) of the samples is calculated within each frame. Then, the normalized values of the DFT coefficients of each frame are accumulated. At the next step, the second DFT is a...
Investigation of Noise Levels in Sugar Factory of Debal Khozaei Agro-industry Complex
Introduction and purpose: Noise pollution can exert negative effects on mental health. In this regard, the present study aimed to evaluate the noise levels in sugar factory of Debal Khozaei agro-industry complex in 2018. Methods: For the current study, sound and audio parameters were measured using a sound meter. These audio parameters included sound pressure level and minimum and maximum valu...
NAVIG: Navigation Assisted by Artificial Vision and GNSS
Finding one's way to an unknown destination, navigating complex routes, finding desired inanimate objects: these are all tasks that can be challenging for the visually impaired. The project NAVIG (Navigation Assisted by artificial VIsion and GNSS) is directed towards increasing the autonomy of visually impaired users in known and unknown environments, exterior and interior, large scale and small...
The Tanner-Culbertson Aural Method of Voice Identification: A Reasonable Alternative to Spectrographic Voice Prints
The use of voice prints, or spectrographic voice identification, has been around for several decades. This highly developed technology uses complex algorithms to break the speech signal down into its component parts: time, frequency, and energy. The result is a spectrogram, also known as a voice print, that allows visual or computer comparisons between two or more audio samples. Voice prints ar...
Adaptive Signal Detection in Auto-Regressive Interference with Gaussian Spectrum
A detector for the case of a radar target with known Doppler and unknown complex amplitude in complex Gaussian noise with unknown parameters has been derived. The detector assumes that the noise is an Auto-Regressive (AR) process with Gaussian autocorrelation function which is a suitable model for ground clutter in most scenarios involving airborne radars. The detector estimates the unknown...
Publication date: 2004